Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
Authors
Abstract
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
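The idea of turning a set of aligned sequences into a position-specific score matrix and then searching with it can be illustrated with a minimal sketch. This is not the PSI-BLAST procedure itself: it assumes a toy gapless alignment, a uniform background frequency for all 20 amino acids, and simple additive pseudocounts, and the names `build_pssm` and `best_window` are hypothetical helpers introduced only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    Assumptions (not taken from the paper): gapless, equal-length
    sequences, a uniform background, and additive pseudocounts.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)

    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Log-odds of the observed frequency against the background.
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def best_window(pssm, sequence):
    """Slide the matrix along `sequence`; return (best score, start offset)."""
    width = len(pssm)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        score = sum(pssm[i][sequence[start + i]] for i in range(width))
        best = max(best, (score, start))
    return best


if __name__ == "__main__":
    matrix = build_pssm(["ACDW", "ACDW", "ACEW"])
    print(best_window(matrix, "GGGACDWGGG"))
```

Residues that dominate a column receive positive scores and all others receive negative scores, so a database sequence is rewarded for matching the conserved positions of the alignment rather than for matching any single query sequence.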
Similar resources
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of...
SALSA: improved protein database searching by a new algorithm for assembly of sequence fragments into gapped alignments
MOTIVATION: Optimal sequence alignment based on the Smith-Waterman algorithm is usually too computationally demanding to be practical for searching large sequence databases. Heuristic programs like FASTA and BLAST have been developed which run much faster, but at the expense of sensitivity. RESULTS: In an effort to approximate the sensitivity of an optimal alignment algorithm, a new algorithm h...
Comparative accuracy of methods for protein sequence similarity search
MOTIVATION: Searching a protein sequence database for homologs is a powerful tool for discovering the structure and function of a sequence. Two new methods for searching sequence databases have recently been described: Probabilistic Smith-Waterman (PSW), which is based on Hidden Markov models for a single sequence using a standard scoring matrix, and a new version of BLAST (WU-BLAST2), which use...
A New Method for Database Searching: The BlastNP
BLAST® (Basic Local Alignment Search Tool) is a set of similarity search programs designed to explore all of the available sequence databases regardless of whether the query is protein or DNA. The standard nucleotide-nucleotide BLAST [blastN] has relatively low sensitivity because of the poor information density of nucleotide sequences. The standard protein-protein BLAST [blastP] is much mor...
Advanced Similarity Searches on the Web: Gapped BLAST, PSI-BLAST, FASTA 3.0 and INCA
In the ever changing world of bioinformatics, the two most popular programs for sequence similarity searching, Basic Local Alignment Search Tool (BLAST) and FASTA, have both recently been improved. BLAST Version 2.0 is now available at the National Center for Biotechnology Information (NCBI) Web site, and FASTA 3.0 is available both as free software for most computer systems and on several Web ...